Ramesh Gopinath's Featurisation and Model Tuning AIML Project

1. Import and Understand the data

2. Data Cleansing

3. Data Analysis & Visualization

Observations:

1. Columns 9, 14, 30, and 170 appear to be approximately normally distributed
2. Columns 32, 33, and 160 are roughly bell-shaped but strongly right-skewed
3. Columns 31, 21, 22, and 25 appear to have 2 clusters
4. Column 27 appears to have 2 clusters and is left-skewed
5. Column 26 appears to have 3 clusters
6. Column 161 is right-skewed
7. Columns 163, 164, 165, and 166 each have 2 clusters: one approximately normally distributed, and a much smaller one with a right skew
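
The skew observations above were read off the plots. As a rough numeric cross-check, the sample skewness of each column can be computed. The sketch below is illustrative only: the helper names `sample_skewness` and `describe_shape` are hypothetical, and the 0.5 cut-off is an assumed rule of thumb, not a value taken from this notebook.

```python
import numpy as np

def sample_skewness(values):
    """Biased sample skewness: third central moment divided by variance ** 1.5."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]              # ignore missing values
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)      # second central moment (variance)
    m3 = np.mean(centered ** 3)      # third central moment
    if m2 == 0:                      # constant column: skewness undefined, report 0
        return 0.0
    return float(m3 / m2 ** 1.5)

def describe_shape(values, threshold=0.5):
    """Label a column using an assumed rule-of-thumb threshold on skewness."""
    skew = sample_skewness(values)
    if skew > threshold:
        return "right-skewed"
    if skew < -threshold:
        return "left-skewed"
    return "roughly symmetric"
```

A positive value indicates a longer right tail and a negative value a longer left tail. Skewness alone does not reveal multiple clusters, so the histograms are still needed for observations 3, 4, 5, and 7.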

Observations:

1. There does not seem to be any significant linear relationship between the independent variables and the dependent variable for most of the columns
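
One way to put a number on this observation is the Pearson correlation between each feature column and the target. The sketch below assumes the target is encoded as 0/1 and that the features are in a numeric 2-D array; the function name `feature_target_correlations` is hypothetical and is not the method used in the notebook.

```python
import numpy as np

def feature_target_correlations(X, y):
    """Pearson correlation of every column of X with the target y.

    Columns with zero variance have no defined correlation and yield NaN.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # center each feature column
    yc = y - y.mean()                # center the target
    numerator = (Xc * yc[:, None]).sum(axis=0)
    denominator = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        return numerator / denominator
```

Values close to 0 are consistent with no linear relationship. A low correlation does not rule out a non-linear dependence on the target.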

4. Data Pre-Processing

The training set has a Pass-to-Fail target distribution of 50.53 to 49.46, the validation set 49.35 to 50.65, and the testing set 48.80 to 51.20.

The target classes are therefore fairly evenly balanced across all three splits.
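
The notebook does not state how the three splits were produced. A minimal sketch of one way to get splits with similar class ratios, and to report those ratios, is shown below. The 60/20/20 fractions, the seed, and the helper names `class_percentages` and `stratified_split` are assumptions for illustration. In practice scikit-learn's `train_test_split` with its `stratify` argument is the more common choice.

```python
import random
from collections import Counter, defaultdict

def class_percentages(labels):
    """Percentage of samples in each class, rounded to 2 decimals."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: round(100 * n / total, 2) for cls, n in counts.items()}

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return one index list per fraction, keeping class ratios similar in each."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    splits = [[] for _ in fractions]
    for indices in by_class.values():
        rng.shuffle(indices)
        start = 0
        for k, frac in enumerate(fractions):
            if k == len(fractions) - 1:
                end = len(indices)   # last split takes the remainder
            else:
                end = start + round(frac * len(indices))
            splits[k].extend(indices[start:end])
            start = end
    return splits

# Usage on synthetic labels (not the project data)
labels = ["Pass"] * 50 + ["Fail"] * 50
train_idx, val_idx, test_idx = stratified_split(labels)
for name, idx in [("train", train_idx), ("validation", val_idx), ("test", test_idx)]:
    print(name, class_percentages([labels[i] for i in idx]))
```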

5. Model Training, Testing and Tuning

6. Post Training and Conclusion